Speech Recognition Using Demi-Syllable Neural Prediction Model

نویسندگان

  • Ken-ichi Iso
  • Takao Watanabe
چکیده

The Neural Prediction Model is the speech recognition model based on pattern prediction by multilayer perceptrons. Its effectiveness was confirmed by the speaker-independent digit recognition experiments. This paper presents an improvement in the model and its application to large vocabulary speech recognition, based on subword units. The improvement involves an introduction of "backward prediction," which further improves the prediction accuracy of the original model with only "forward prediction". In application of the model to speaker-dependent large vocabulary speech recognition, the demi-syllable unit is used as a subword recognition unit. Experimental results indicated a 95.2% recognition accuracy for a 5000 word test set and the effectiveness was confirmed for the proposed model improvement and the demi-syllable subword units.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese

We study the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison for systems based on following primitive speech units: syllable, demi-syllable (Initials and Finals), context-independent phones, left-or-right context-dependentphones (diphones), and leftand-right context-dependent phones (triphones). In our speakerdependent con...

متن کامل

Multi-lingual speech recognition based on demi-syllable subword units

Hungarian, unlike English, is an agglutinating language, so new, special methods are needed for speech recognition. The word dictionary could become very large due to its complex morphological system, so a suitable approach could be to use subword units as, eg., half (demi) syllables and language models. In this way most of the natural languages can be described, therefore this method can be ap...

متن کامل

پیش‌بینی قابلیت فهم همخوان‌ها در افراد دارای شنوایی عادی با استفاده از مدل‌های میکروسکوپی دارای معیار فاصله‌ مختلف در بازشناساگر خودکار گفتار

In this study, recognition rates of consonants available in vowel-consonant-vowel structure in hearing tests and two microscopic models will be investigated. Such a syllable structure doesn’t exist in Farsi and Azerbaijani languages, but since the goal is only recognition of middle phoneme, according to hearing tests, listeners are able to properly recognize phonemes in clean speech conditions....

متن کامل

A hierarchical model for syllable recognition

Abstract. Inspired by recent findings on the similarities between the primary auditory and visual cortex we propose a neural network for speech recognition based on a hierarchical feedforward architecture for visual object recognition. When using a Gammatone filterbank for the spectral analysis the resulting spectrograms of syllables can be interpreted as images. After a preprocessing enhancing...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1990